Aggregation of Uncertainty about Subjective Judgments
Leonard Clerk Johnson
University of Washington


When one is called upon to make a subjective judgment about a person,
object or event, it is common to experience some degree of uncertainty about the
precision of the judgment. Judgmental uncertainty plays a large role in the
psychology of decision making, and within this context the concept has received
a good deal of analysis. In decision making the judgments of interest often
are about proportions and probabilities of various events. In some cases the
array of events is discrete (e.g., the possible diseases that a patient might
have) and in other cases it is continuous (e.g., the proportion of people in the
U.S. who own foreign automobiles). The work reported here deals with tasks that
have continuous probability distributions.

Ignorance, risk and ambiguity are terms used to describe various states of
uncertainty about population proportions or probabilities (Yates & Zukowski,
1976). If the distribution over all possible values of the proportion or
probability is sharply peaked, the decision maker is fairly certain about the
appropriate probability and a decision based upon it is said to be made in a
state of riskiness (Luce & Raiffa, 1957). Ignorance is represented by the case
in which all probabilities are seen as equally likely (Coombs, Dawes & Tversky,
1970). A rectangular distribution over the range of probabilities is the
technical definition of ignorance, but from a practical standpoint, any broad,
reasonably flat distribution approximates the condition. Risk and ignorance are
extreme conditions which are seldom realized; all the possible variations
between these extremes are said to be states of ambiguity (Ellsberg, 1961).
Consequently, a continuum of uncertainty states can be envisioned in which
ignorance progressively develops into ambiguity and ultimately into risk as the
probability distributions become progressively more peaked. Anything that
“sharpens” the distribution promotes greater certainty about the point estimate
(or vice versa) and anything that “flattens” it promotes greater uncertainty.

One factor that should promote either sharpening or flattening of the
distribution is the amount of pertinent information the judge has about the
probability that is to be estimated. Peterson and Phillips (1966) demonstrated
how observations pertinent to an event contribute to judges' knowledge about it
and how subjective probability distributions (uncertainty) narrow as
observations proceed. Their subjects' task was to estimate the proportion of red
poker chips in a large urn on the basis of a sequence of random draws of one
chip at a time. Rather than giving a point estimate of the proportion, the
judges described their subjective probability distributions over the range of
possible values of the proportion (.00-1.00) after each draw. This required them
to give 33% credible intervals using a method that will be described below. When
the obtained intervals were compared to intervals derived by using Bayes'
Theorem, it was found that given enough draws, participants eventually selected
a probability that was the same as that predicted by the normative model.
However, they did so about half as quickly. That is, the participants were able
to learn about the proportion but their speed of doing so was “conservative”
compared to the normative model. Conservatism is common in probability revision
experiments (Phillips, Hays & Edwards, 1966; Peterson & Beach, 1967; Slovic &
Lichtenstein, 1971), and usually is of about the same magnitude as was found in
this study.

Although participants apparently could deal with it, the method Peterson
and Phillips (1966) used to elicit credible intervals is quite complex:


The Ss were told to imagine a large urn filled with poker chips,
some red and the others blue. Their task was to make estimates
about the proportion p of red chips in the urn when p was selected
by a random procedure such that all values between 0 and 1 were
equally likely. The Ss were told that they would receive information
about the value of p by observing a sequence of chips drawn from
the urn. Each S's task was to use two markers to trisect a
scaled 0-1 continuum into three intervals such that it was equally
likely that p was contained in any of the intervals.

The following training procedure was used. The E displayed a
practice sample of nine red chips and one blue chip. The Ss were
told to use the sample as a basis for inferences about p. Each S
was instructed to set his markers at 0.333 and 0.667, and then to
imagine that he and two Es would each bet a dollar on which of the
three intervals contained the correct proportion of red chips in
the urn. Each of the bets had to be placed on a different interval
and S was allowed the first choice. No S picked the 0 to 0.333
interval, and it was agreed that this interval was a bad bet. Next,
S was instructed to move the markers to 0.899 and 0.901; this time
it was agreed that the small 0.899 to 0.901 interval was a bad bet.
E then explained that the bets for the first settings were not
equally good because too little attention had been paid to the sample
of red and blue chips; too much attention was paid to the sample for
the second settings. Then E instructed the Ss to set the markers so
that the three intervals would be equally good bets, i.e., if the S
were left with the last choice of an interval he would consider his
bet as fair as the other two. As a second practice trial, Ss
were told that a new urn had been selected and that a sample had
been drawn which contained no red chips and six blue chips. They
were again instructed to set the markers to yield three intervals,
each of which would be an equally good bet. The E interrogated the
Ss to insure that they understood the concept of three equally good
bets.

At the beginning of each sequence of draws the Ss were
required to set the markers at 0.333 and 0.667. They then revised
and recorded these settings after observing each successive chip in
the sequence of draws. The results of previous draws in the sequence
were displayed throughout the sequence (p. 19-20).
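
The normative benchmark for this trisection task can be sketched numerically.
The following minimal Python sketch assumes a uniform prior over p and
independent draws, so that the posterior is proportional to
p^red * (1 - p)^blue. The function name `trisect_posterior` and the grid
resolution are illustrative assumptions, not part of the quoted procedure.

```python
def trisect_posterior(red, blue, grid=10001):
    """Return two marker positions that split the posterior over p into
    three equally likely intervals.

    Assumes a uniform prior on p and independent draws (an assumption of
    this sketch), approximated on an evenly spaced grid over [0, 1].
    """
    step = 1.0 / (grid - 1)
    ps = [i * step for i in range(grid)]
    # Unnormalized posterior density at each grid point.
    dens = [p ** red * (1.0 - p) ** blue for p in ps]
    total = sum(dens)

    targets = (1.0 / 3.0, 2.0 / 3.0)
    markers = []
    cum = 0.0
    for p, d in zip(ps, dens):
        cum += d / total
        # Record the first grid point at which each cumulative target is met.
        while len(markers) < 2 and cum >= targets[len(markers)]:
            markers.append(p)
    return tuple(markers)


if __name__ == "__main__":
    print(trisect_posterior(0, 0))  # before any draws
    print(trisect_posterior(9, 1))  # first practice sample
    print(trisect_posterior(0, 6))  # second practice sample
```

Before any draws the markers fall near 0.333 and 0.667, the starting
settings the Ss were required to use; as draws accumulate the three
intervals shift and the middle one narrows.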


Similarly complex methods have been used in other studies (Beach, 1975) and,
with a good deal of practice, people often can become proficient in the use of
intervals although they tend in most cases to be moderately inaccurate when
compared to some objective, statistical standard (e.g., Slovic, Fischhoff &
Lichtenstein, 1977).

The problem with credible intervals is that the experimenter tells the judge
the criterion to use, e.g., a 33% interval, a 99% interval or whatever. While
this is meaningful to the experimenter it is not necessarily so to the judge
unless he is given extensive, time-consuming training. To overcome this
problem Beach and Solak (1969) invented the “equivalence interval” (EI). This
is the interval around some point estimate that the judge feels is “reasonably
likely” to contain the true value of whatever is being estimated. Put another
way, if the true value turned out to lie within the EI, the judge would count
the estimate as essentially correct.

A number of studies support the contention that the EI is a useful measure
of a judge's differential uncertainty about the accuracy of subjective judgments.
Beach and Solak (1969) presented people with arithmetic problems (e.g., 87% of
96 = 83.5) and asked them to put EIs around the correct answers to indicate the
range within which they would regard someone's answer as essentially correct if
the person were to work the problem in his head. It was found that the intervals
were a constant proportion, K, of the magnitude of the correct answer, C (that
is, EI/C = K) and that K was different for difficult and easy problems.

Laestadius (1970) had people examine lists of 15 numbers of either high or
low variance. They were given the correct mean of each list. Then, for each
list they were asked to specify an EI around the mean within which an unaided
judge's estimate of the mean would be close enough to be “in the ballpark.”
The EIs were significantly larger for high variance lists than for low variance
lists, just as credible intervals would be.


Beach, Beach, Carter and Barclay (1974) further examined the properties
of EIs. They found that the “law of proportionality” obtained by Beach and
Solak (1969) held only for prothetic continua, as would be expected, but that
even for metathetic continua the EIs were larger for unfamiliar events than
for familiar ones. When judges set EIs around their own estimates of a
population proportion based on a random sample, the EIs decreased as either the
sample size increased or as the proportion approached 1.00 or 0; both conditions
would affect a credible interval in the same way. Judgments of people's ages
yielded K = ... for strangers' ages and K = ... for strangers' judgments of the
judge's own age. EIs for the seriousness of various life events (Holmes &
Rahe, 1967) yielded K = .31 and for the seriousness of diseases (Wyler, Masuda
& Holmes, 1968) K = .11. A final study showed that the size of the EI around
hypothetical sums of money that one could inherit or give away was influenced
by one's supposed wealth or poverty and by whether or not one was
involved in a gamble. While this is a strange potpourri of topics, the studies
nonetheless demonstrate that EIs vary with the variables that both common
sense and statistics dictate they should.

With the exception of the proportion estimation study in Beach et al.
(1974), EIs have always been placed around points that the experimenters
specified rather than having the judges use them to indicate uncertainty about
the accuracy of their own judgments. However, the results of the proportion
estimation study showed that, when used in the latter way, the EIs behaved as
credible intervals would have.

In this paper EIs will be used to examine how judges aggregate uncertainty
about the accuracy of their own subjective judgments when these component
judgments are compounded into overall judgments. For example, suppose a
contractor were to make a series of “educated guesses” about the cost of various
components of some job and then sum these guesses using pencil and paper to get
an overall estimate. Each guess, however educated it may be, is still a guess
and as a result has some degree of accompanying uncertainty. Therefore, the
sum of the guesses also must have accompanying uncertainty. The questions of
interest are: In a task like this can people aggregate uncertainty? If they
can, is it possible to model the process? Does aggregation differ for easy and
difficult judgments? Does the number of component judgments influence the
aggregation?


Aggregation of uncertainty has been examined most thoroughly in the Bayesian
revision studies (Peterson & Beach, 1967; Slovic & Lichtenstein, 1971; Slovic,
Fischhoff & Lichtenstein, 1977). However, the normative model used in that
research is not applicable to the present form of aggregation. Indeed, no
normative model exists for this situation. Therefore we can only conjecture
about possible models and then empirically select the best.

Anderson and his associates have examined the ways in which information,
as opposed to uncertainty, is aggregated in various situations. They have
found that this kind of aggregation most often can be described as either
additive or averaging.

Additivity means that aggregation is best described as a process in which
information is merely summed as it is received. For example, a person's
preference for a lunch consisting of a certain kind of sandwich and a certain
kind of drink is the sum of his preference for the two separately (Shanteau &
Anderson, 1969).

Averaging means that aggregation is best described as a process in which
information is pooled. For example, a person's net impression of another person
seems to be the average of the other's positive and negative characteristics
(Anderson & Alexander, 1971).

Because these two descriptions of aggregation are simple and because they
have been found adequate for a broad variety of tasks, it is reasonable, in
lieu of a normative model for uncertainty aggregation, to consider them as the
leading hypotheses for examining uncertainty aggregation.

The research strategy consisted of presenting participants with series
(strings) of arithmetic problems and asking them to work each problem in their
heads, write down the answer, and place an EI around it. Then they used pocket
calculators to sum the answers to the component problems and placed an EI
around that sum. Some strings were predominantly easy problems and some were
predominantly difficult. For reasons that will become clear, the first
experiment used strings of five component problems, the second used three, the
third two and four.


Experiment 1


In the first study the sequences had five component problems. Estimated
answers to these problems were aggregated on pocket calculators into either a
sum or an average. EIs were obtained for the estimated answers for each
component problem (EI_c) as well as for the sum (EI_s) or average (EI_a). Of
interest was whether EI_s and EI_a were related in some orderly way to the EI_c
of the component problems, implying that uncertainty was aggregated in some
manner, and whether the additional step of dividing to obtain an average
influences EI_a in any way.

The method followed in this experiment will be described in some detail
both because it is complicated and because the subsequent experiments were
conducted in a similar way.

Materials

To provide a meaningful context for the problems each sequence was
associated with a short cover story. There were four stories. One required the
judge to imagine himself to be standing in a check-out line at a supermarket. He
plans to pay cash and wishes to estimate the total cost of the groceries to see
if he has sufficient money. This involves solving component problems such as
“68 lbs. of dog food at 11¢ per pound” or “... avocados at ... each,” etc.
Similar stories and problems involved a contractor (“2998 electrical outlets
at 99¢ each”), judgments of people's weight (“What would a 20 year old male
who was 6 feet tall weigh?”) and straightforward percentage problems (“What is
99% of 2998?”). Appendix A contains these stories and a representative set of
component problems.

A two-step pilot study was conducted to obtain a pool of difficult and
easy problems. In the first, participants were asked to work approximately 50
problems in their heads and to rate the difficulty of each on a 5-point scale.
Although rough, the results indicated that the per cent was the biggest
determinant of difficulty, with the size of the number upon which the per cent
operated being of less importance. For example, problems requiring that one
take 25%, 50% or 99% of a number were “easy” while those involving 87%, 13% or
47% were judged to be “difficult”. Using these results, a large pool (approx.
300) of problems was constructed. Each problem was placed on a small slip of
paper, and another group of participants was asked to sort the problems
according to whether they were difficult or easy (without actually solving the
problems). The results were tested using a binomial test; items were regarded
to be difficult or easy if they were placed in one category consistently enough
to reach a 0.05 significance level.

About 80 problems eventually were selected. Drawing from these, six
sequences of five problems each were constructed. Each sequence was constructed
so that the component problems had answers falling within one of three intervals
along the number line. Two sequences each were constructed for the three
intervals tested. The intervals represented progressively higher orders of
magnitude (i.e., 10 to 100, 100 to 1000, and 1000 to 10,000), with the result
that the experiment was able to test the effect of the size of the numbers
being manipulated. Finally, the sequences were made predominantly difficult
or easy (one each for each interval) by placing four difficult and one easy
problem together to form a difficult sequence and four easy and one difficult
problem together to form an easy sequence.

The problems were presented in booklets in which the order of sequence
presentation as well as the problem order within each sequence were each
independently randomized for every participant. Instructions at the end of
each sequence requested either a summing or averaging of the component problem
estimates. For each sequence one-half of the participants were instructed
merely to sum and the other one-half were instructed to compute the average.
For each participant one-half of the sequences required sums and one-half
required averages.

Procedure 

Participants were seated at desks and given pocket calculators and a
booklet of experimental materials. The experimenter gave extensive
instructions (Appendix B) that emphasized the purpose of the study, the meaning
of EIs and how to record them, and a series of practice problems to permit
familiarization with the time limits placed on doing the problems and the range
of difficulty of the problems. It is important to note that the calculators
were only used to sum (average) the estimated answers to the component
problems; they were not used on the EIs. Use of the calculators insured that
the sums (averages) were mathematically accurate so that doubts about inaccurate
adding (or dividing) would not contribute to the EI_s (EI_a).

For each component problem in a sequence, participants were given 20
seconds to make an estimate of the answer and 30 seconds to place an EI
around this answer. Participants were specifically instructed to make use of
the full 20 seconds and were not allowed to proceed from the estimate to the
interval until the time had elapsed. Similarly, participants did not begin
the next component problem until the 30 seconds allotted for the interval
elapsed. After each of the 5 component problems had been completed,
participants used the calculators to compute a sum or average of their
estimates and then placed an EI_s or EI_a around this value.

Participants 

Participants for all experiments to be described were solicited through an
advertisement in the University newspaper. Both students and nonstudents were
represented in the resulting pool and all were paid three dollars per hour for
participation. Twenty-eight persons were involved in this study of whom six
were subsequently dropped from analysis because of their apparent failure to
comprehend the instructions and/or because they produced uninterpretable data
(e.g., their EIs did not surround the estimate).

Results 

To examine the effect of the difficulty manipulation the magnitude of the
point estimate must be taken into account; previous research shows that EIs
increase as the magnitude of the point estimate increases (Beach & Solak, 1969;
Beach et al., 1974). To do this each EI is divided by its accompanying point
estimate and the result, k, is submitted to analysis.
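
The normalization just described can be written out as a short Python sketch.
It assumes that an EI is recorded as a single interval size and that each
point estimate is nonzero; the function names `relative_width` and `mean_k`
are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_width(ei_size, point_estimate):
    """Return k, the EI size divided by its accompanying point estimate."""
    if point_estimate == 0:
        raise ValueError("k is undefined for a point estimate of zero")
    return ei_size / abs(point_estimate)


def mean_k(pairs):
    """Mean k over an iterable of (ei_size, point_estimate) pairs."""
    ks = [relative_width(ei, est) for ei, est in pairs]
    return sum(ks) / len(ks)


if __name__ == "__main__":
    # Hypothetical values: an EI of 17 around an estimate of 100.
    print(relative_width(17, 100))
    print(mean_k([(10, 100), (6, 100)]))
```

Dividing by the point estimate puts EIs from the three orders of magnitude
on a common scale, so that easy and difficult sequences can be compared
directly.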

To test the difficulty manipulation the mean k for EI_s and EI_a was
computed both for the easy and for the difficult sequences for each participant.
For 19 out of 22 participants the mean k for the difficult strings was larger
than for the easy strings (p < .001 by a sign test). The overall mean k for
the difficult strings was .17 and for the easy strings it was .06. This result
is congruent with those obtained by Beach and Solak (1969).
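
As a check on the sign test, the exact binomial tail probability can be
computed directly. This minimal Python sketch assumes a one-sided test with
no ties and n = 22 participants (28 recruited less 6 dropped); the source
does not state whether the test was one- or two-sided, and the function name
`sign_test_p` is illustrative.

```python
from math import comb


def sign_test_p(successes, n):
    """One-sided exact sign test: P(X >= successes) for X ~ Binomial(n, 0.5)."""
    tail = sum(comb(n, k) for k in range(successes, n + 1))
    return tail / 2 ** n


if __name__ == "__main__":
    # 19 of 22 participants with larger mean k on difficult strings.
    print(sign_test_p(19, 22))
```

Under these assumptions the tail probability is well below .001, and doubling
it for a two-sided test leaves it below .001 as well.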

Using the ratio of EI_s to the products, sums, or averages of the EI_c for
each sequence it is possible to test the adding and averaging hypotheses.
Figure 1 shows the relative frequency of each value of these ratios across
sequences and across participants. A ratio of 1.00 indicates that the hypothesis
in question is a good fit. While neither of the hypotheses is really a very
good fit, the adding hypothesis clearly is better than the averaging hypothesis.
Thirty-nine per cent of the ratios lie within the .50-1.50 interval around 1.00,
while for the averaging hypothesis this interval contains only 4 per cent.
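
The ratio analysis can be sketched as follows. This is a minimal Python
illustration, assuming that the additive hypothesis predicts an aggregate EI
equal to the sum of the component EIs and the averaging hypothesis predicts
their mean; the function names and sample values are hypothetical.

```python
def fit_ratio(aggregate_ei, component_eis, hypothesis="additive"):
    """Ratio of the observed aggregate EI to the EI a hypothesis predicts.

    A ratio of 1.00 means the hypothesis fits that sequence exactly.
    """
    total = sum(component_eis)
    if hypothesis == "additive":
        predicted = total
    elif hypothesis == "averaging":
        predicted = total / len(component_eis)
    else:
        raise ValueError("hypothesis must be 'additive' or 'averaging'")
    return aggregate_ei / predicted


def share_near_one(ratios, low=0.5, high=1.5):
    """Proportion of ratios falling in the interval [low, high] around 1.00."""
    inside = sum(1 for r in ratios if low <= r <= high)
    return inside / len(ratios)


if __name__ == "__main__":
    components = [2, 4, 6, 3, 5]             # hypothetical component EIs
    print(fit_ratio(10, components))         # against the additive prediction
    print(fit_ratio(4, components, "averaging"))
    print(share_near_one([0.2, 0.5, 1.0, 1.5, 3.0]))
```

With five components the averaging prediction is one fifth of the additive
prediction, so a judge who averaged would produce a ratio of 0.2 when scored
against the additive hypothesis.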

Figure 1
Comparison of Additive and Averaging Model for the
Figure 2
Relative Frequency Histogram for Ratios Between

Of course, the most striking thing about these distributions is their
skewness. Inspection of the raw data reveals that this results from a relatively
few very extreme values of both EI_s and EI_c. These are not attributable to any
particular participants, problems or sequences; they appear merely to be error.
However, when used to calculate ratios these values lead to extremely large
or small ratios, depending on whether it is the EI_s or the EI_c that is extreme.

Analysis of the ratios is hampered by these extreme values. Specifically,
it is not possible to use standard descriptive statistics to summarize the data.
In this case, the mean ratio is 2.21. The literal interpretation of this value
would be that the participants inflated the sum of the component
uncertainties by a factor of roughly two. But this clearly is not the case.
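
The sensitivity of the mean to a few extreme ratios can be illustrated with a
small Python sketch. The ratios below are invented for illustration and are
not data from the experiment; the function name `summarize` is hypothetical.

```python
from statistics import mean, median


def summarize(ratios):
    """Return the mean and median of a list of ratios."""
    return {"mean": mean(ratios), "median": median(ratios)}


if __name__ == "__main__":
    # Hypothetical ratios: most lie at or below 1, one is extreme.
    ratios = [0.4, 0.5, 0.6, 0.8, 0.9, 1.0, 1.1, 12.0]
    print(summarize(ratios))
```

In this made-up sample a single extreme value pulls the mean above 2 while the
median stays below 1, which is the kind of distortion described above.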

Figure 2 expands the additive histogram in the critical region around one.
This plot further emphasizes the fact that neither the additive nor the
averaging hypothesis is truly adequate. In fact, the mode (0.45) lies between
the theoretically appropriate value of 1.0 for adding and 0.2 for averaging.
But, in any case, the majority of responses do not lead to a ratio that is
very much greater than 1.3. Figure 2 also shows just how inconclusive these data
are. The histograms reveal that even the “reasonable” responses form into an
undifferentiated pattern in the region around and slightly below one. There
could be many reasons for this, but before jumping to any premature theoretical
speculations it is best to consider the simple possibility that aggregating
over five component problems simply is too difficult for people to do well.
To test this possibility a second experiment was conducted in which the task
was made less complex.

Insert Figure 2 about here


Experiment 2 


The task was simplified in two ways. First, the number of component
problems was reduced from five to three. Second, participants were not asked
to compute averages for any of the sequences. In addition to these changes,
for the two “construction” sequences (see Appendix A) the procedure was altered
so that participants worked only two problems, found the sum of these answers
and placed an interval around this sum. A third problem was worked, its answer
was added to the former sum and then an EI_s was placed around this final sum.
The construction problems were, therefore, a separate experimental manipulation
within the body of the basic experiment. These sequences of length two were
included in case strings of three component problems proved to be too difficult.

In all other respects the experiment was exactly the same as
Experiment 1.

Participants

Twenty-eight participants were drawn from the aforementioned pool of people
who answered the newspaper advertisement. Six were subsequently dropped due to
uninterpretable data, leaving n = 22.

Results

Repeating the previous analysis for the difficulty manipulation showed that
20 of the 22 participants had larger mean k's for difficult sequences than for
easy sequences (p < 0.001 by sign test). The overall mean k for difficult
sequences was 0.17 and for easy was 0.06.

Figure 3 shows the relative frequencies of the values of the ratio of EI_s
to the sum of the EI_c over problems and participants.

Insert Figure 3 about here 

Comparing this histogram with that obtained in Experiment 1 it is clear
that the additive hypothesis is more effective for the three component case.
Specifically, the general shape of the distribution is more in line with
expectations in that the median and mode are coincidental. This indicates that
the obtained distribution is more stable (i.e., greater consistency in the
data). Furthermore, there is a marked increase (61% vs. 39%) in the number of
ratios in the 1.00 ± 0.5 interval.

Although there are only two data points (two sequences) per participant,
the effects of reducing the sequences to two component problems can be examined
preliminarily using the sequences for which two problems were worked, their
answers summed (the first EI_s), a third problem worked, and the latter added
to the first sum (the second EI_s). The ratio of the first EI_s to the sum of
the EI_c for the first two problems was computed for each of the two sequences
for each participant. As was true for the three component sequences, 61% of the
ratios for the two component sequences lie in the 1.00 ± 0.5 interval. This
suggests that there is nothing gained by reducing the sequences from three to
two components.

Continuing to consider these special sequences, when the answer to the
third problem is added to the sum of the answers of the first two it should be
very like aggregating uncertainty for a two component sequence. In fact, this
proved to be the case. Apparently, uncertainty about a sum of problem answers
added to yet another uncertain problem answer is aggregated similarly to the
simple two problem case.

In summary, the results of this experiment show that the responses of
participants can be more effectively described by an additive hypothesis when
task complexity is reduced. The improved effectiveness of the additive hypothesis
in the transition from five to three components was also found for the special
two and three component sequences studied in this experiment. The two component
data cannot be considered as stable since there are only a very few points.
Consequently, it seems reasonable to question whether two component sequences
might reveal a still greater improvement in the effectiveness of an additive
hypothesis (Experiment 3a).

In addition, it would be useful to obtain some idea of the effects of
manipulations. The transition from Experiment 1 to Experiment 2 involved both
a change in the number of components and excluded the averaging operation. It
could be that for some reason participants in Experiment 1 were less able to
aggregate uncertainty because of the averaging operation rather than because of
the large number of components. Moreover, simple compulsivity dictates that
four component sequences be examined to see if performance approximates that on
five component sequences or three component sequences. Another study (3b) using
four components and providing some tentative indications of the effect on the
process due to a multiplying operation was undertaken.


Experiment 3a


The format for this experiment was the same as for the previous two.
There were 8 sequences with 2 component problems each. Sequence difficulty
was manipulated by having 2 difficult items in the difficult sequences and 2
easy ones in the easy sequences. Additionally, there were sequences with a
component from each difficulty level that were classed as having mixed
difficulty. Of the eight sequences, four were mixed, two were easy, and two were
difficult. As before, the participants worked the component problems placing
an EI_c around each, found a sum using the calculators, and formed an EI_s
around this sum.


Following administration of these eight sequences there was a second
experimental condition in which simple percentage problems were presented.
Participants worked each problem and for each were given a number to add to
their answer. An equivalence interval was then placed around the sum. Three
of these simple problems were easy and three were difficult. The size of the
number to be added was counterbalanced within the difficulty levels so that
the three orders of magnitude discussed in Experiment 1 were approximated.
Since the numerical constant had no uncertainty associated with it, this
procedure can be viewed as a one-component aggregation task.

Participants 

Twenty participants were obtained from the pool. Two were subsequently
dropped from the analysis, leaving n = 18.

Results 

As in previous experiments, the manipulation of uncertainty was examined
using a sign test on the average string k for each participant, using only the
easy and difficult sequences. For 16 of 18 participants the mean k was larger
for difficult sequences than for easy ones (p < 0.001). The mean k for easy
sequences was 0.03; for difficult sequences it was 0.19; for the mixed
sequences it was 0.10.

Repeating the previously used analysis of ratios of EI_s to the sum of the
EI_c resulted in a mean ratio of 0.95. This is very close to the result obtained
for the three component sequences in Experiment 2 as well as for the two
component sequences of that experiment. This reinforces faith in the simple
adding hypothesis (see Figure 4).


Figure 4
Relative Frequency Histogram of Ratios for the
Two Component Sequences of Experiment 3a

When a number dictated by the experimenter is added to the answer to a
single problem there should be no increase in uncertainty about the accuracy
of the resultant sum. However, because this sum has been increased the EI is
expected to increase along with it (Beach & Solak, 1969). Therefore, to see
if the participants' actual uncertainty remained unchanged through the adding
operation, the mean k for the EI_c was compared to the mean k for the EI_s.
There was no difference, indicating that for single component sequences with a
non-uncertain operation appended the operation does not increase relative
uncertainty. For k to remain unchanged the EI_s must have increased. Although
this increase is illogical, it is consistent with the results obtained by
Beach and Solak (1969).
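
The arithmetic behind this observation can be made concrete with a short
Python sketch. It assumes, as the result above suggests, that k stays the
same before and after the constant is added; the function name and the
numbers are hypothetical.

```python
def ei_after_adding_constant(answer, ei_component, constant):
    """EI expected around (answer + constant) if relative width k is unchanged."""
    if answer == 0:
        raise ValueError("k is undefined for an answer of zero")
    k = ei_component / answer          # relative uncertainty before adding
    return k * (answer + constant)     # same k applied to the larger sum


if __name__ == "__main__":
    # Hypothetical case: answer 50 with an EI of 5, then 150 is added.
    print(ei_after_adding_constant(50, 5, 150))
```

In this made-up case the interval grows fourfold even though the added
constant carries no uncertainty, which is why the increase is called
illogical.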

Experiment 3b

In this study the sequences had four components and difficulty was not
manipulated. Rather, the sequences were arranged so that the two sequences
for each cover story were as closely matched with regard to difficulty as
possible. In addition, the “grocery” problems were set up so that participants
worked the sequence normally and then were required to multiply their
answer by four and place an interval around this final value. The
multiplication was consistent with the cover story (i.e., compute the monthly
bill if the first answer were your weekly bill) and was designed to be both a
test of the manipulation's effect and a variation in the procedure to determine
if the results in Experiment 1 could be explained on that basis. The procedure
in all other respects was the same as in the previous studies.

Participants 

Twenty participants were obtained from the participant pool. They were 
all used in the data analysis. 
Results 

The method used in previous experiments was applied to the data and the
obtained frequency distribution is shown in Figure 5. The value of 51 per cent
in the interval 1.00 ± 0.5 as well as the general shape of the curve are
more similar to the two and three component experiments than to the five
component case.

A comparison of the EI_s (sum EI) and the EI_m (multiplied EI) was made
using a simple ratio. It seems reasonable that participants would increase
their intervals by four when they multiplied, essentially paralleling their
manipulation of the EI's. A ratio of 4:1 would indicate that this in fact
was taking place. Using all of the data available ( © SC) yielded a mean
ratio of 3.15. This indicates that participants did increase their intervals
but that they were unwilling to expand their intervals to the full amount
that an additive hypothesis would specify as appropriate.
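
The ratio comparison described above can be sketched in a few lines. This is
only an illustration, not the analysis actually run in the study: the function
name `expansion_ratios` and the sample widths are hypothetical, and it assumes
that each response reduces to a pair of interval widths, one for the summed
estimate (EI_s) and one for the multiplied estimate (EI_m).

```python
from statistics import mean


def expansion_ratios(width_pairs):
    """Return EI_m / EI_s for each (ei_sum_width, ei_multiplied_width) pair.

    Pairs whose sum-interval width is not positive are skipped, because the
    ratio is undefined for them.
    """
    return [ei_m / ei_s for ei_s, ei_m in width_pairs if ei_s > 0]


# Hypothetical interval widths (illustrative only, not data from the study).
sample_pairs = [(2.0, 8.0), (3.0, 9.0), (4.0, 10.0)]

ratios = expansion_ratios(sample_pairs)
mean_ratio = mean(ratios)

# A mean of 4.0 would correspond to full fourfold expansion; a mean below 4.0
# means the intervals were widened less than the multiplication implies.
print(f"mean expansion ratio: {mean_ratio:.2f}")
```

Under this reading, the reported mean of 3.15 sits below the 4:1 benchmark in
the same way the hypothetical sample does.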

Although the preceding studies demonstrate the viability of an additive
model for the process of uncertainty aggregation, the results are incomplete
for two reasons. First, since the experiments were all conducted using
students who were not necessarily skilled in the techniques of estimation, it
is natural to question the generalizability of the results. "Real world"
experts, individuals who as a part of their professional activities make and
process estimates on a daily basis, may be different from a student population
that has no real interest in the task or process being studied.

Second, the experiments thus far have been designed with the hope that
the data would possess low enough variability to allow the experimenter to
infer a descriptive model for the process. In fact, the level of variability
has proven to be quite high. Because no systematic interviews were conducted
with the students, it is not possible to cross check the additive hypothesis
with the subjective impressions of the participants themselves.

As a result of these difficulties a fourth experiment was undertaken.

Experiment 4

The objectives for this experiment were twofold: (1) to test the
plausibility of an additive model for uncertainty aggregation using experts
with well developed estimating skills, and (2) to structure the data collection
process in a way that would allow the experimenter to compare objective
results of the experiment with the participant's observations of what he
believed he was doing (his subjective results).

Participants 

The cooperation of four practicing architects was obtained for this
experiment. Although they received no financial remuneration, their enthusiasm
during the experiment and their subsequent interest in the results indicated
that they were highly motivated.

Materials

A floor plan of a large clinic was obtained and all measurements removed.
Seven rooms were chosen from among those represented on the plan. Since the
actual surface area in these rooms was held fairly constant (mean size of
room was 15.6 m², S.D. = 6.4 m²), the difficulty of estimating the surface area
represented could be unambiguously manipulated by varying the complexity of
the perimeter. An architect who served as an advisor to this experiment rated
the chosen shapes for difficulty so that a manipulation of component problem
difficulty analogous to that used in the earlier experiments could be carried
out.

Procedure

The architects participated individually in a conference room provided
by the firm that employed them. The experiment consisted of three phases:
introduction to the task and training in EIs, the task, and a debriefing
interview.

The training and introduction phase was similar to that given the student
participants in previous experiments.

The task phase consisted of two trials. Each trial involved the area
estimation and an EI specification for each of three rooms. Following the
third room in each trial the participant was informed of the total estimated
surface area of the three rooms (i.e., the sum of his three point estimates)
and was asked to place an EI around this sum.

The interview phase consisted of two parts. In an initial part the
participant was encouraged to describe the strategy he used in EI specification
and aggregation without any specific direction being offered by the
experimenter. The role of the experimenter in this section was to facilitate
the discussion by restating the strategies as they were given, in order to
encourage the participant to continue, and by offering comments designed to
focus the participant's attention on aspects of his strategy that were not
clearly described.

The second part of the interview was structured. Although an unstructured
format allows participants latitude in how they describe their strategies, it
often can occur that they are unable to do so with any precision. The
structured format was designed therefore to give participants a series of
fixed reference strategies in the hope that they would then be able to specify
more precisely the differences between them and their own strategy. Moreover,
there was some concern that data from the unstructured interview might
not permit comparisons between individuals. By using the structured interview
to provoke discussion it was hoped to obtain data that could be compared
across persons.

The structured interview involved short paragraphs that were written to
suggest one of three strategies: largest interval, additive, and averaging.
(The paragraphs associated with each of these strategies are given in
Appendix C.) The participants were asked to imagine that they were trying
to communicate their own strategy to a person who had just expressed the
point of view represented in that paragraph. Specifically, the participants
were to formulate statements which would adequately communicate their reasons
for agreeing or disagreeing with the strategy portrayed.

Results 

Each participant was asked to describe his method of estimating the room
sizes. In every case the process of estimation was obvious to the
participants. Each chose a specific feature in the plan and based upon past
experience made an estimate of its size. In every case this feature was the
doors, which were estimated to be 1 meter wide. Using this feature as a
standard, the participants attempted to estimate the length and width of the
target room. Rooms that were not simple geometric shapes (i.e., rectangles
or triangles) were mentally manipulated so that the surface area represented
by the odd shape was translated into one of these forms and subsequently
analyzed. The ease with which the participants verbalized their estimation
strategy and the uniformity of this strategy suggest that the task was well
suited to their area of experience and expertise.

The participants were also asked to verbalize their strategy for estimating
the EI around each surface area. Here, too, the responses were strikingly
consistent across all participants. Although each architect described the
process in a personalized way, the main components of this description were
consistent. Through discussion it became clear that the EI was perceived as
a measure of their confidence in the point estimate. More detailed probing
led to a list of criteria that were thought to be important determinants of
the EI. Specifically mentioned were perimeter complexity, experience with
the type of building being discussed (i.e., hospital vs. warehouse, apartment
or retail store), confidence in the accuracy of the standard or modulus being
used, and the extent of a person's experience in estimating. Since, for a
given series of estimations, all of these factors will be constant except the
complexity of the perimeter, these results suggest that the manipulation of
problem difficulty was successful and unambiguous. An industry accepted
heuristic of ± 10% was mentioned by 3 of the 4 participants, but none of them
reported using this type of fixed % strategy; an observation that is supported
by their data (Figure 6a).


Insert Figure 6 about here

Figure 6a Data Obtained in Experiment 4

Figure 6 gives the results obtained in Experiment 4. Section a of this
figure lists the numerical data by participant, task and stimulus as well as
the EI specified for the total surface area for the three stimuli. Section b
summarizes these numerical results by assigning an inferred or best
approximation model to each trial. Also included is the participant's verbally
reported strategy; his model for the process of uncertainty aggregation.
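
One way to assign a best approximation model to a trial is to compare the
aggregated EI the participant gave with what each reference strategy would
predict from the component EIs, and to pick the closest. The sketch below
assumes exactly that nearest-prediction rule; the text does not state the rule
actually used, and the function names and trial values are hypothetical.

```python
def predicted_intervals(component_eis):
    """Aggregated EI predicted by each of the three reference strategies."""
    total = sum(component_eis)
    return {
        "largest interval": max(component_eis),
        "additive": total,
        "averaging": total / len(component_eis),
    }


def best_approximation(component_eis, observed_ei):
    """Name the strategy whose prediction lies closest to the observed EI."""
    predictions = predicted_intervals(component_eis)
    return min(predictions, key=lambda name: abs(predictions[name] - observed_ei))


# Hypothetical trial: EIs for three rooms, then the EI placed around the sum.
trial_components = [2.0, 3.0, 5.0]
label_sum_like = best_approximation(trial_components, observed_ei=9.5)
label_avg_like = best_approximation(trial_components, observed_ei=3.0)

print(label_sum_like, "/", label_avg_like)
```

A rule of this kind always returns some label, so it says nothing about how
well the chosen model fits; that would have to be judged separately.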

Discussion of the aggregation strategy, in sharp contrast to the previous
two processes, was marked by substantial differences between individuals.
Since the specifics of the aggregation strategies are in general different,
it will be necessary to deal with the results of each individual rather than
as a group.

Participant 1 was not able to vocalize his strategy immediately. However,
his statements became progressively more specific as the interview progressed
and these statements were all consistent with the averaging strategy which he
ultimately specified as being correct. This participant felt: (1) that the
errors over all the problems would tend to "average out" (i.e., an accurate
estimate with relatively small EI should have as much of an effect on the
aggregated EI as a poor estimate whose EI is rather large); (2) that all
problems should contribute in some way to the EI of the overall estimate; and
(3) that confidence affects the particular strategy chosen.

This last point was pursued in some detail and although the results are
not as clear as could be desired, they do provide an interesting glimpse into
this heretofore uninvestigated area. The possible strategies were seen by
this participant as ranging between a purely additive and purely averaging
approach. In low confidence situations the hypothesis that errors will
average out is the least tenable and as a result, an additive model would be
the proper choice. High confidence situations are better handled using an
averaging model. However, this was judged to be true only if the criteria
surrounding the problem remained approximately the same. If the expectations
increase along with the person's confidence, the model of choice would again
be one of additive aggregation.

Participant 2 was able to immediately verbalize his strategy as a summing
model that incorporated an additional term. It was clear that this participant
was using an overtly conscious approach to the task based in part on his
belief that a normative model for accumulated error exists in mathematics.
Accordingly, he felt that the appropriate aggregated error was slightly more
than the sum of the component EI's. Since this approach disallows the
possibility of compensating errors, it is distinctively non-statistical. The
data from this participant are in excellent agreement with his verbalized
strategy.

This participant felt: (1) that all component problems should contribute
to the overall estimated uncertainty, (2) that the averaging and largest
interval strategies (Appendix C) were not cautious enough, and (3) that the
choice of strategy was dependent on problem characteristics and personal
expertise.

He specifically mentioned an example used to introduce the concept of
an EI during the introductory phase of the experiment, in which he was asked to
estimate the amount of money in both his own and the experimenter's wallet, as
an example in which, due to his lack of confidence in such estimates, an
averaging model would be most appropriate.

Participant 3 was also immediately able to verbalize the strategy he had
used to aggregate his uncertainty estimates. Although he couched the process
in terms of percentages, the result was equivalent to a summing model and, as
indicated in Figure 6, this observation is in excellent agreement with his data.

Here, too, the participant's strategy was explicitly based upon an
assumed theory of error accumulation and was consistently and consciously
applied; in other words, a heuristic.

The interview with this participant was particularly interesting because,
having specified the strategy used throughout the exercise, he proceeded to
argue both for and against its merits more or less concurrently. It became
clear from the ensuing dialog that (1) he would use a model in which errors
were allowed to compensate if asked to do the task again, and (2) confidence
was the factor that would primarily influence his choice of strategy.
Specifically, he stated that high confidence situations were amenable to a
model that assumed the errors were compensating and that low confidence
situations were most appropriately handled by an additive aggregation
procedure.

Although participant 4 was not able to verbalize a clearly defined
strategy, he did specify his approach in enough detail to permit inference of
a basic model. Like the other participants, he felt that all of the component
problems should contribute in some way to the overall EI. He also stated
that the aggregated interval should be larger than the largest component
problem interval. These observations suggest an additive model. The
participant attempted to specify a weighting strategy by which the component
problems were incorporated into an EI for the sum. Perimeter complexity and
the size of the area being estimated relative to the total were specified as
the salient features of the weighting process but it was not possible to
determine the actual equation during the interview.

During the structured interview this participant was asked to respond
to several bogus strategies. He characterized the "largest interval" strategy
as that of a "middle-of-the-roader." Further investigation on this point
revealed that he viewed this strategy as lying between summing and averaging
on a continuum that could be loosely defined as conservativeness; summing
being the most conservative type of response and averaging the least
conservative (i.e., most risky).

In this light it is interesting to note that each of the other participants
indirectly supported this organization. Participant 1, for example,
characterized the summing strategy as too conservative when compared to his
averaging approach while another participant felt averaging was overly
confident (i.e., overly liberal) when compared to his summing strategy.

In addition, the strategies used in the structured interview were
consistently accepted as plausible. The participants would often respond by
observing that the bogus strategy was understandable or with a phrase like,
"I can see what he was doing but . . ."

The results taken as a whole support the following observations.


(1) Individuals can approach the problems of uncertainty aggregation in
two distinctly different ways. The first, characterized by an
easily verbalized strategy and a high level of congruence between
this strategy and observable behavior, gives the impression of being
a heuristic. The second, as the antithesis of the first, appears
to be a more subjective response to the task.

(2) The method by which subjective uncertainty assessments are combined,
whether or not the individual is prone to using a heuristic approach,
is not fixed.

(3) The choice of strategy is made on the basis of problem and personal
characteristics. These include the expectations surrounding the
solution, the confidence (skill and experience) of the problem solver,
and the individual's personal response style (conservative,
nonconservative).

(4) Those individuals who use a subjective approach to uncertainty
aggregation are more likely to respond inconsistently to a set of
apparently similar problems.

These four observations imply that the lack of consistency both between
and within the participants of the first three experiments can be attributed,
at least in part, to a process that is at best distinctively individualistic
and at worst highly unstable.

This is not to say that the process is totally unsystematic. There
appears to be general agreement that each component should contribute (i.e.,
be incorporated) to the overall uncertainty specification. In addition,
all of the participants in this fourth experiment saw the possible strategies
as in some way falling along a continuum of progressively more or less risky
(or conservative) strategies.

In fact, the participants, irrespective of personal orientation, ordered
the strategies similarly. Summing is seen as the most conservative with
averaging the least conservative. Three out of the four participants felt
that increasing confidence would lead to a strategy progressively more similar
to an averaging or compensating error model.

Discussion

These experiments were undertaken to investigate the process by which
uncertainty is aggregated. Taken as a whole they contribute to the
decision making literature in two areas: updating of subjective probabilities
and multi-attribute utility theory. The concept of uncertainty aggregation
is not a completely new one. Researchers have extensively studied the
process by which subjective probabilities for discrete events are updated in
light of new information. The definition of uncertainty used in these
studies, a subjective probability that a specific event will occur, is
distinctively advantageous in that a Bayesian model is known to be normative.
Unfortunately, this definition is not well suited to problems that cannot be
defined in terms of the occurrence of a limited set of possible events.
Furthermore, the fundamental finding of this literature, conservatism, is
currently believed to be an artifact of the "book bag and poker chip" paradigm
(Slovic & Lichtenstein, 1971). By selecting a less restrictive definition
of uncertainty it has been possible to investigate uncertainty aggregation in
a new context. Specifically, the focus has been placed on the process by
which uncertainty is aggregated across tasks. This change of focus can be
viewed as an initial attempt to revitalize an area of research that has been
prematurely set aside.

In their latest review of the decision literature, Slovic, Fischhoff and
Lichtenstein (1977) point out that the axiomatic basis of multi-attribute
utility theory (MAUT) has developed rapidly in the last five years. These
developments have led to a fairly cohesive set of axioms which, if satisfied,
assure that the decomposition model implied by that set of axioms will lead to
the choice of the best alternative from among a set of alternatives. The
basis of this decomposition approach is a series of judgments about attributes
relevant to the overall problem which are subsequently combined into an
aggregated estimate of utility for the alternative. Although there has been
some interest in MAUT's sensitivity to small errors in the specification of
the component judgments (Fischer, 1972), the literature is surprisingly
silent on the uncertainty that should be attributed to the aggregated utility.
The procedure followed throughout this experimental series is a direct
analogue to the MAUT technique and thus the questions of uncertainty
aggregation that are examined relate directly to the development of this
decision making tool.

The use of an equivalence interval as a measure of uncertainty brought
with it certain methodological disadvantages. Very little is known of the
relationship between EI and common descriptors of probability distributions
such as the variance. As a result, it has not been possible to use statistical
concepts as a basis for a normative model. More importantly, the Bayesian
model used in previous uncertainty aggregation studies could not be
meaningfully applied to this type of problem. Drawing on the results of work
in an analogous area of research, information integration (Anderson, 1970), two
models were proposed. These were (1) an adding model,
$EI_s = \sum_{i=1}^{n} EI_i$, and (2) an averaging model,
$EI_s = \frac{1}{n} \sum_{i=1}^{n} EI_i$. The use of information integration
concepts was appealing because of the broad range of topics that have already
been successfully described using these simple algebraic models. Furthermore,
a simple relationship between component problems and aggregated uncertainty
was expected. With that in mind a series of three experiments was undertaken
in which the characteristics of the component problems as well as the number
of component problems were varied.
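
Varying the number of component problems matters because the two models pull
apart as sequences get longer. The short sketch below illustrates this with
hypothetical component EIs that are all equal; the function names are
illustrative and are not taken from the original study.

```python
def adding_model(component_eis):
    """Adding model: the aggregated EI is the sum of the component EIs."""
    return sum(component_eis)


def averaging_model(component_eis):
    """Averaging model: the aggregated EI is the mean of the component EIs."""
    return sum(component_eis) / len(component_eis)


# Hypothetical sequences of two to five components, each with an EI of 2.0.
for n in (2, 3, 4, 5):
    eis = [2.0] * n
    print(n, adding_model(eis), averaging_model(eis))
```

With equal components the averaging prediction stays at the component EI
while the adding prediction grows in proportion to the sequence length, so
longer sequences separate the two hypotheses more sharply.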

The first experiment used sequences of five component problems. As a
manipulation check some sequences were difficult and some were easy; EIs
proved to be wider for the former than for the latter, indicating that
participants' uncertainty was manipulated. The various hypotheses generated
quite different predictions and, though noisy, the data ruled out a strict
averaging model. However, the adding model's ability to account for the
data was not particularly impressive. In theory this result could have
occurred simply because the participants did not understand the problem.

This explanation is ruled out for two reasons. First, the experimenter
devoted extensive time to the explanation of the problems, as well as the
techniques being used. Furthermore, the directions incorporated solving
exact analogues of the problems to be used during the experiment and
questions were encouraged and completely answered. Secondly, the data from
each experiment were carefully scanned for response patterns that were not
consistent with the directions (e.g., the EI boundaries did not enclose the
point estimate). These participants consistently represented a very small
percentage of the total sample. It was necessary to see if this inadequacy
resulted from the participants' inability to aggregate uncertainty reliably
or if it was the result of the burden of having to deal with five component
problems. To examine this, a second experiment was performed using sequences
that had only three component problems. As in the previous study problem
difficulty was used as a manipulation check and yielded similar results.
The additive model was more strongly supported in this experiment though the
level of noise was still fairly high.

Subsequently, a two-part experiment was conducted. First, two component
sequences were used; the results were similar to those in the three component
case. In the second part of the experiment, the sequences had four components.
The objectives were (1) to determine whether the additional procedure in
Experiment 1 had been responsible for the relatively poor performance and
(2) to find out if participants could deal as successfully with four as they
did with three.

In spite of their highly variable responses, about half of the participants'
responses were fairly well described by a simple additive model. Figure 7
shows the percentage of ratios that fell outside the criterion interval
(1.0 ± 0.5) as a function of the number of components (i.e., error percentage
vs. length of sequence). There was decreasing ability to cope with the task
as the number of component problems increased. Deteriorating prediction of the
additive model was the most notable in the transition between four and five
component sequences. However, the results of these experiments were not
completely satisfactory. Although an additive model provides some predictive
power, the relatively high variability of the data suggests that the process
involves more variables than had previously been expected. The fourth
experiment was conducted in an attempt to reduce variability by controlling
for the participants' estimating skill. Specifically, Experiment 4 used expert
estimators in a task consistent with their area of expertise. Although the
results demonstrated that individuals can be stable aggregators of
uncertainty, it was also apparent that this occurs only when a satisfactory
heuristic for the task can be developed by the individual. This is seen as a
special case of a more general subjective response process.
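
The error percentage discussed earlier in this paragraph (the share of ratios
falling outside 1.0 ± 0.5) can be sketched as follows. This section does not
define the ratio itself, so the sketch assumes it is the participant's
aggregated EI divided by the sum of the component EIs, which makes 1.0 the
value for a perfectly additive response. The function names and trial values
are hypothetical.

```python
def additive_ratio(component_eis, observed_ei):
    """Observed aggregated EI relative to the additive prediction (assumed)."""
    return observed_ei / sum(component_eis)


def percent_outside(ratios, center=1.0, half_width=0.5):
    """Percentage of ratios lying strictly outside center +/- half_width."""
    outside = [r for r in ratios if abs(r - center) > half_width]
    return 100.0 * len(outside) / len(ratios)


# Hypothetical trials: (component EIs, EI given for the aggregate).
trials = [
    ([2.0, 3.0], 5.0),
    ([2.0, 3.0], 2.0),
    ([1.0, 1.0, 2.0], 7.0),
    ([4.0, 4.0], 10.0),
]

ratios = [additive_ratio(parts, observed) for parts, observed in trials]
error_percentage = percent_outside(ratios)
print(f"{error_percentage:.0f}% of ratios fall outside the criterion interval")
```

Ratios exactly on the boundary (0.5 or 1.5) are counted as inside here; the
source does not say how such cases were treated.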

Figure 8 shows a process oriented summary of the uncertainty estimation
and aggregation process obtained in Experiment 4. It is divided into those
areas or characteristics that are objective and those that are subjective.
Since the discussion that follows depends rather heavily on the figure for
clarity, the reader is advised to use the diagram in conjunction with the
following text. As expected, the estimation of component uncertainty is
affected by the characteristics of the problem. However, the confidence of
the individual also has a direct effect on the size of the interval. This
level of subjective confidence has at least two primary sources of input.
The first, environmental characteristics, involves external constraints such
as time, the importance of the decision, limitations on the available tools
for solution, and the expectations of the people surrounding the decision
maker. In addition, the background of the individual directly influences his
perceived confidence. All of the people interviewed saw the level of their
experience and their previous successes/failures as important determinants of
their confidence. Although this diagram shows no feedback loop, it should be
obvious that the decision process by which component uncertainties are
specified is repeatedly applied until every problem has been analyzed.

The decision process by which uncertainty is aggregated is seen as
requiring an aggregating strategy and a set of component uncertainties.
Although other factors may be involved, it is believed that the fundamental
determinant of the choice of aggregating strategy is a variable that is in
some sense unique to every individual; his personal response to uncertainty.
This variable appears to be a function not only of his level of confidence,
but also of environmental characteristics. This organization of the strategy
selection process is intended to emphasize its dynamic nature. Based upon
personal characteristics and external demands the decision maker can be
described as selecting the appropriate strategy from a set of theoretically
acceptable strategies. All participants in Experiment 4 were able to accept
the bogus strategies as plausible, indicating that the set of possible
strategies is not limited to the one being used by an individual even if his
strategy could best be described as a heuristic.

One way to interpret the consistency that emerges when a heuristic is
used and the observed inconsistency of both the between and within participant
data from Experiments 1, 2 and 3 is to view this section of the process as
underdetermined. This suggests that given the same input on several occasions
the resulting aggregating strategy cannot be expected to be the same. The
existence of a heuristic may allow the individual to effectively bypass the
"choice of strategy" section of this model thereby reducing this source of
inconsistent responses.

As indirect support for this hypothesis the data from Experiment 3b were
reanalyzed. A strategy for each string was inferred from the data so that
the participant's consistency could be roughly gauged. These inferred
strategies were reasonably stable for six of the 20 participants, while 14
had patterns that were inconsistent.

Experiments 1, 2, and 3 did not attempt to control for any of the
subjective variables. It is therefore not surprising that the results do not
allow a strategy to be more clearly specified. In fact, it seems apparent
that a complete specification of strategies will require a multidimensional
approach that accounts both for the variables surrounding the decision (i.e.,
problem) and the decision maker. The work presented here suggests that man's
subjective uncertainty aggregating skills are rather limited and that his
choice of strategy most likely involves a complicated interaction between the
objective circumstances surrounding a decision and his own internal
characteristics.

It appears true that man should not be used as a standard in questions of
uncertainty aggregation. Multi-attribute utility theory hinges upon the
concept that man is very limited as an information integrator. This series
of experiments would suggest that the uncertainty to be associated with the
aggregated utility is probably best handled by a simple and defendable
heuristic. One aspect of man's approach that is worthy of attention is that
his response is not stereotyped. This suggests that the choice of heuristic
for MAUT should be flexibly defined in a way that would allow the user to
assign an uncertainty that is responsive to the needs of the situation.




References

Anderson, N. H. Functional measurement and psychophysical judgment.
    Psychological Review, 1970, 77, 153-170.

Anderson, N. H., & Alexander, G. R. Choice test of averaging hypothesis for
    information integration. Cognitive Psychology, 1971, 2, 313-324.

Beach, B. H. Expert judgment about uncertainty: Bayesian decision making in
    realistic settings. Organizational Behavior and Human Performance,
    1975, 14, 10-59.

Beach, L. R., Beach, B. H., Carter, W. B., & Barclay, S. Five studies of
    subjective equivalence. Organizational Behavior and Human Performance,
    1974, 12, 351-371.

Beach, L. R., & Solak, F. Subjective judgments of acceptable error.
    Organizational Behavior and Human Performance, 1969, 4, 242-251.

Coombs, C. H., Dawes, R. M., & Tversky, A. Mathematical psychology. Englewood
    Cliffs, N.J.: Prentice-Hall, 1970.

Ellsberg, D. Risk, ambiguity and the Savage axioms. Quarterly Journal of
    Economics, 1961, 75, 643-669.

Fischer, G. W. Four methods for assessing multi-attribute utilities: An
    experimental validation (Tech. Rep. 037230-6-T). Ann Arbor: University
    of Michigan, Engineering Psychology Laboratory, 1972.

Holmes, T. H., & Rahe, R. H. The social readjustment rating scale. Journal of
    Psychosomatic Research, 1967, 11, 213-218.

Laestadius, J. E. Tolerance for errors in intuitive mean estimations.
    Organizational Behavior and Human Performance, 1970, 5, 121-124.

Luce, R. D., & Raiffa, H. Games and decisions. New York: John Wiley and
    Sons, 1957.

Peterson, C. R., & Beach, L. R. Man as an intuitive statistician.
    Psychological Bulletin, 1967, 68, 29-46.

Peterson, C. R., & Phillips, L. D. Revision of continuous subjective
    probability distributions. IEEE Transactions on Human Factors in
    Electronics, 1966, 7, 19-22.

Phillips, L. D., Hays, W. L., & Edwards, W. Conservatism in complex
    probabilistic inference. IEEE Transactions on Human Factors in
    Electronics, 1966, 7, 7-18.

Shanteau, J. C., & Anderson, N. H. Test of a conflict model for preference
    judgment. Journal of Mathematical Psychology, 1969, 6, 312-325.

Slovic, P., Fischhoff, B., & Lichtenstein, S. Behavioral decision theory.
    Annual Review of Psychology, 1977, 28, 1-39.

Slovic, P., & Lichtenstein, S. Comparison of Bayesian and regression
    approaches to the study of information processing in judgment.
    Organizational Behavior and Human Performance, 1971, 6, 649-744.

Wyler, A. R., Masuda, M., & Holmes, T. H. Seriousness of illness rating
    scale. Journal of Psychosomatic Research, 1968, 11, 363-374.

Yates, J. F., & Zukowski, L. G. Characterization of ambiguity in decision
    making. Behavioral Science, 1976, 21, 19-25.


This research was partially supported by Office of Naval Research
Contract N00014-76-C-0193 (Terence R. Mitchell and Lee Roy Beach, Investigators).


Percentage Problems

The problems in our first category will be simple per cent problems.


Construction Problems

In the next section the problems will be similar to those a construction
estimator would be asked to solve.

Try and place yourself in the following situation:

You are an estimator for a construction firm. The bid is due in 20
minutes and the boss wants some figures fast. Glancing over your notes
you come up with the following:


Supermarket Problems

This time we want you to picture yourself in a supermarket.

As you solve this set, try to keep the following scene in mind:

You are waiting in the check-out line at your favorite supermarket. You
plan to pay cash but you've gone wild on some of the specials and you may
not have enough to cover the bill. A quick check of the basket reveals:
nt Problems 

Awericans are often concerned with their ett (overweight, uncerwignt, 


idea! weights, etc.). In the next section you wi!! awe « chance to 
estimate some fuman body welgits. 
APPENDIX B
EXPERIMENTAL INSTRUCTIONS


My name is Clerk Johnson. I am doing my dissertation in decision making,
and this will be one of the last experiments that I am running for that
dissertation. If you have read the preliminary blurb on the front page of your
booklet you have a pretty good idea of what we are going to do. This really
is a very straightforward experiment. It essentially involves making
quantitative judgments and then supplying some additional information about
these judgments.

The research is being funded by the Navy. Their primary interest in this
field is in improving their ability to make distributed judgments. That is,
somebody makes judgment "X" and somebody else makes judgment "Y" and somehow
all of it comes together to make a decision. We are developing techniques
which we can test, to see whether or not they work. This is going to be one
of those techniques.

If you are a student, or if you have been a student, I am sure you have
had the experience of taking an examination where the answers were a single
number or some very specific thing. Oftentimes you know more about what you
are doing than that one number can indicate. For instance, it would be nice
to be able to get more credit for the right answer (because you know for
certain that it is correct, that it is the right answer) than someone who
just gets that answer but basically didn't have any idea in the world what
they were doing. I know that has happened to me.

This whole task is quantitative in nature and there are vast differences
in peoples' feelings about doing quantitative tasks. So, I have a three-point
scale that I would like you to rate yourself on by writing a number between 1
and 3 on the front page of your booklet. In particular, if you are an engineer
or a hard science major or an accountant, a mathematician or for whatever
reasons your daily fare is numbers and you enjoy it, you would write a 3. If
you are one of those people who is not particularly pleased when the end of
the month rolls around and you are forced to balance your check-book, then you
would use a 1. And if you are somewhere between the two extremes then you
would use a 2. So, would you write something down for yourself (pause).
O.K., now as I say, this involves almost exclusively quantitative tasks,
that is, manipulating numbers. One of the things that I am sure you will
notice as we do them is that there are very easy problems and more difficult
ones. We meant for it to be that way, and since I am not going to give you as
much time as you would like to have to solve these problems, I want to encourage
you not to become discouraged about not being able to answer the problems
exactly. Also, I want to caution you about becoming lazy when the problems
are simple. Lots of people get to a simple problem, figure "Oh boy, this is
really easy," and then make big mistakes. So, please do use all of the time
that I give you to actually work on solving the problem.

Now, obviously, the point estimate, that single number, is not the only
thing that we want. And to give you an idea of how that works we will work
as a group to solve a simple judgment task and I think that that will
demonstrate this simple procedure to you. Has everyone lived in this state
long enough to know that there is a city called Spokane? Is there anybody who
is not aware of that? O.K., why don't you write on the front page of your
booklet a number that indicates your best guess as to how many miles it is
from here to Spokane. We won't spend hours on this, because it is meant to be
just kind of a quick little exercise. Alright, now, what are some of the
numbers that we have? ("300, 200, 250"). Why don't we say 325 as a group,
O.K.? I want to assure you that it is not crucial to me what the correct
answer is, this is just a question of making judgments and trying to live
with them.

If I came to you and said I have an almanac here and it says that 500
miles is the distance to Spokane, if this is the true answer (500 miles),
what would you say about your estimate (325 miles)? Do you think that this
estimate (325 miles) is a good estimate of 500 miles if 500 is the correct
answer? No, probably not. Alright, let me come down here to the other extreme
and say, "What if I told you that it was 330 miles to Spokane?" Would you
feel that 325 as a guess was a good enough guess of 330 as the correct
answer? O.K., let's try it somewhere in between. How about 400? If 400 is the
correct answer would 325 be a good guess? Or a bad guess? Good? Bad? Now we
are getting into kind of a grey area. And I think it is clear to you that
there are certainly numbers for which your guess is a bad guess and there are
certainly other numbers for which your guess is pretty good, and somewhere in
between there has to be a number that is the dividing line between those two.




I should also emphasize that there are tremendous individual differences
so I don't expect that you individually would all agree as to what that number
should be. I expect, in fact, that there will be differences. But let us say
as a group that it was 390. Maybe you feel that it should be 420. That
doesn't bother me. But somewhere up here there has to be a number that is
your boundary between these two regions. We can do that on the other end as
well. If 100 is the right answer, 325 is probably not a good estimate. What
would you say was the number such that it was the boundary between these two
regions? Your own number, what is it? "270," any others? Is that a pretty
good one? Alright, now what does this mean? Well, what this says essentially
is that if the true answer lies anywhere in this interval defined between
270 and 390, then it is your personal feeling that your guess of 325 was in
the ballpark, it was close enough. If it is anywhere outside of that region
then it would be your statement that you had missed the problem. Are there
any questions about that? We call that a "ballpark estimate" because you are
basically telling me what the ballpark is around your estimate. And as I say,
these things can really vary. When I did it, I thought 254 was the answer.
And I think 260 is my upper limit and 2.5 is my lower limit. Now, obviously
I have a much smaller interval than you do. Perhaps that is a reflection of
the fact that I drive to Spokane every break. And if I don't know what the
distance to Spokane is within a pretty tight space, then I would be disappointed
in myself. That should indicate to you that when you know more about something
then you have higher expectations. When you know something about one of those
problems, it should be reflected in the width of these intervals. When you
don't know anything, then you have to tell me that. Some of what I have said
is summarized on the next page, so let's turn the page and I will read with
you.

The ballpark estimate describes your personal expectations
concerning the accuracy of an estimate.


The ballpark estimate is an interval around your estimated
answer such that if the true answer lies outside of that
interval you would feel that you had missed the problem.
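
The definition above can be illustrated with a short sketch. It is not part
of the experimental materials; the class name Ballpark, the method name
in_ballpark, and the choice to count the boundaries themselves as inside the
interval are all assumptions made for illustration. The numbers are the
group's Spokane values from the instructions (lower boundary 270, estimate
325, upper boundary 390).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ballpark:
    """A point estimate with the judge's own lower and upper boundaries."""
    low: float
    estimate: float
    high: float

    def __post_init__(self):
        # The estimate must lie inside its own interval; it need not be centered.
        if not (self.low <= self.estimate <= self.high):
            raise ValueError("expected low <= estimate <= high")

    def in_ballpark(self, true_answer: float) -> bool:
        """True when the true answer falls inside the interval.

        Assumption: the boundaries themselves count as inside.
        """
        return self.low <= true_answer <= self.high


# The group's values for the distance to Spokane.
spokane = Ballpark(low=270, estimate=325, high=390)
print(spokane.in_ballpark(330))  # True: close enough
print(spokane.in_ballpark(500))  # False: the judge "missed the problem"
```

Under this reading, a true answer of 400 also falls outside the group's
interval, which matches the "grey area" the experimenter describes just above
the 390 boundary.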


The ballpark estimate can have zero width and to demonstrate that, consider
this problem: "50% of 402" (written on board). If I provided three blanks
like this and I told you to put the estimate here, what number would you
choose? (201) Probably, that would be a guess too, and it would be my
expectation that I wouldn't miss that. In fact, I feel that that really ought
to be the correct answer. To indicate that I would put 201 as my lower boundary
on the problem and 201 as my upper boundary. Clearly, I am saying, "I am
absolutely certain that this is the correct answer. If it is anything
different than this, I have missed the problem." Now, I expect that there
will be problems for which this will be true for you. There are some very
easy problems. When that is true, please do not put zeros in these blanks.
It is the difference between these two numbers which refers to zero width, not
the values themselves.
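
A small sketch of the point about zero width, under the assumption that
"width" simply means the upper boundary minus the lower boundary. The function
name interval_width is hypothetical. The values are the "50% of 402" example
and the group's Spokane interval.

```python
def interval_width(low: float, high: float) -> float:
    """Width of a ballpark interval: upper boundary minus lower boundary."""
    if high < low:
        raise ValueError("expected low <= high")
    return high - low


# "50% of 402": both boundaries equal the estimate itself, not zero.
print(interval_width(201, 201))  # 0
# The group's Spokane interval from earlier in the instructions.
print(interval_width(270, 390))  # 120
```

Writing 0 in both blanks would instead describe an interval located at zero,
which is why the instructions ask for the estimate to be repeated in the
boundary blanks.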

The last statement on your page says that the interval need not be
symmetric. What that means can be demonstrated by another example. For
instance, let us say that we went to the King Dome, and we knew that, for the
event, 65,000 was the maximum seating capacity. If I said, "Let's estimate the
number of people here," I would put up the three blanks again and you might
say "63,000 people." Because of the extra information that it cannot be greater
than 65,000 you have an upper boundary. You know that it cannot be greater
than 65,000. So let's just say that you put that in the blank for upper
boundary. However, the lower boundary is unclear. It is very hard to estimate
a crowd, so you might say «5,000 for a lower limit. Now, in terms of the
number line, if this is the estimated value (63,000) then you are really making
a statement that looks something like this: [number line not reproduced].
In other words, your estimate is certainly not in the middle of this interval.
That is fine, because what I really want to know is what you know about your
answer. And I am trying to allow you as much flexibility to express that as I
can within this paradigm. Are there any questions?

Now, as I said, this is a part in a series of experiments and following
experimental rigor we have to control things a little bit. That means that I
have to ask your cooperation in working together through these problems. It
is not possible for me to just turn you loose to do them. There are two ways
that I do that. One is these little cards that I am handing out. These are
called problem guides, for lack of a better term. When you start a series of
problems you should place these guides such that only one problem is exposed
at a time. Your card ought to have something like low, estimate, and high
written on it in the places that line up with the three blanks for each
problem. Test this yourself and if you don't like the way it lines up turn
your card over and make your own guide. All of the answers will be written on
the white booklet paper, not on these cards.
In addition to that, I am going to control how long you are able to work
on a problem and in what order. In other words, I will time you for 20
seconds to solve the problem that we are working on. By solving I mean in
your heads. You aren't going to use the calculator and you are not to use a
pencil and paper. Just look at it and do the best you can in 20 seconds.
Then I will say "stop." I need an answer in every blank so I will wait for
you a short time if necessary while you get something down. Once I say stop
please proceed with all haste to write something down. And then I will say
"Let's go on" and I will give you 30 seconds to put an interval around the
point estimate. It should be obvious that I am very interested in
these intervals because I am giving you more time to put them down than I
am to estimate the answer. And I would appreciate it if you would really
think about this as your best statement about what you know. It is not
crucial to me that you don't know very much. If you don't know much, that is
just fine. If you know a lot that is fine, too. What I need is some kind
of personal consistency; that you do express what you know.

O.K., why don't we turn the page and use the problem guide to highlight
just the first problem. I have a series of six problems that you can use as
trial problems. We will go through them in the standard procedure I have
already described and at the end of that time we will have some more
instructions. So let's begin and I will give you 20 seconds to work this
problem. I want the estimate and that goes in the middle column. Stop. These
are practice problems and you are free to ask questions while we are working
them. Although at times it seems tedious, I would appreciate it if you would
follow my directions exactly. In other words, please do not proceed to write
an interval around the number until I tell you to do so. I want you to spend
20 seconds making an estimate and I am going to try to force you to do that.
O.K.? Now, I will give you 30 seconds to put an interval around this. Stop.

I realize that this particular problem is a very easy one and that it may
not take 30 seconds. In addition, I should point out that there are times
when after 20 seconds you have put something down and in the remaining 30
seconds you realize that your answer is totally fallacious, that there is no
way it could conceivably be correct. Well, sometimes what people have done is
that they have their answer on the number line right here and in the process
of the 30 seconds they realize it is wrong so they put their interval up
here. Please do change this answer if you realize that you are totally wrong.
However, if we have passed a problem, if we are completely done with one and
you realize that all three of your answers were in error, then leave it alone
because I would rather that you concentrated completely on the problem that
we are on. Let's move on and do the next problem now. I will give you 20
seconds to make an estimate. Stop. Put something down. No more thinking, just
write something. It gets better as we go along. It really becomes easier. All
right, now I will give you 30 seconds to put an interval around that. Stop.
Next problem.

Repeat procedure for three problems. 


That is half the practice problems, are there any questions 
about this? 


Repeat for three more problems.

All right, that is basically what we are going to be doing. That is also
a pretty good sampling of the difficulty range. For any of you who find this
a difficult task, I have two hints. The first is one of the most important
things you can do: figure out how many digits there are going to be in your
answer before you get to the decimal point. Sometimes it is worth spending
15 seconds just figuring that out. Once you know that, then start working
the digits and you'll find you will be more comfortable with what you are doing.
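
The hint about counting digits can be made concrete with a brief sketch. The
function name digits_before_decimal is hypothetical, and treating values
below 1 as having no leading digits is an assumption. The first example
reuses the "50% of 402" problem from the instructions; the multiplication
problem is an invented illustration, not one of the experimental problems.

```python
def digits_before_decimal(value: float) -> int:
    """Count the digits to the left of the decimal point of a positive value."""
    if value < 0:
        raise ValueError("expected a non-negative value")
    if value < 1:
        return 0  # assumption: values below 1 have no leading digits
    # Truncate the fractional part and count the remaining digits.
    return len(str(int(value)))


print(digits_before_decimal(0.5 * 402))  # 3 digits (201)
print(digits_before_decimal(48 * 212))   # 5 digits (10176)
```

Rounding the invented problem to 50 x 200 = 10,000 gives the same digit count
without exact arithmetic, which is the kind of quick check the hint suggests.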

Now let's turn to the next page. What follows now is an attempt to
explain to you what to expect in the experiment, the kind of procedure that
we will follow. These problems always come in groups and there is always a
text preceding the group. This is an example of such a text. When you get to
the text you are free to read it. They repeat so that at some point in the
experiment you won't need to read it anymore, but please put yourself in this
scene. When I tell you to turn the page, which is now (this is the
demonstration), you will find two problems. We solve these as we did the
practice problems, in a sequence. I will then ask you to add the numbers that
you have made as estimates on these calculators.

For those of you who did not bring your own calculators, I will explain
how to use the ones we will provide for you. You type in the number. I think
the first problem's answer is 17.07 and you type in a plus and then the
second number and then you press the equals sign to get your answer. The
calculators are very inexpensive and tend not to work the way you expect them
to. That is why I handed out a few extras. If you have one that has at the
top a penciled-in number, like a 4 or a 2, then in the past the numbers listed
have not appeared in the display after being punched. So you are reasonably
certain that the calculator will work correctly if you just make sure that
the number you punch does in fact appear in the display. O.K., those of you
who have your own calculators have a great advantage.

Then clearly you will be adding only the estimates and that number goes
down here on this line. When you do that you are free to turn the page
(we'll do that now) and on the following page there will always be the two
remaining blanks for the sum that you have used the calculators to find. Now
it may not be immediately obvious, but I am sure it will seem obvious as I say
it, that although you use the calculators to add these two numbers up and
you know that your sum is correct for the two numbers that were in the blanks,
it is not necessarily true that those numbers were correct. That is to say,
you have two items which you add up, but the blanks were each a guess, so the
true value that would appear here is not necessarily the sum of these two
values. What I want to know is what is the interval that surrounds this value
that makes it correct in your opinion. In other words, it is the same
operation in terms of finding the intervals, but now you are going to
talk about your answer as against the true sum of the two problems. Is that
clear? The interval will always go on the page following the page with the
problems, then there will be another little text and it starts again. This
process is repeated several times. So now you should all have something
talking about weight. That is the first of the sequences of problems. This
completes the instructions, and if you have any questions at all, I would like
you to ask them now.
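
The distinction between a correctly added sum and a correct sum can be shown
with a minimal sketch. All four numbers below are hypothetical body weights
invented for illustration; none come from the experimental booklet.

```python
# Hypothetical judged weights and hypothetical true weights (pounds).
estimates = [150.0, 120.0]
true_values = [160.0, 125.0]

# The calculator guarantees only that the estimates are added correctly.
estimated_sum = sum(estimates)
true_sum = sum(true_values)

# Each estimate carries its own error, and the errors carry into the sum.
component_errors = [est - true for est, true in zip(estimates, true_values)]
sum_error = estimated_sum - true_sum

print(estimated_sum)     # 270.0
print(true_sum)          # 285.0
print(component_errors)  # [-10.0, -5.0]
print(sum_error)         # -15.0
```

The error of the sum equals the sum of the component errors, so the interval
placed around the calculated sum has to express the judge's uncertainty about
both component estimates together.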
PARAGRAPHS SUGGESTING: (1) LARGEST INTERVAL, (2) ADDITIVE, AND
(3) AVERAGING STRATEGIES


(1) The approach I used in solving the problems was to use the biggest
interval from among the component problems as my final interval. I was very
careful in assigning my boundaries to begin with so I didn't see any reason
to readjust them afterwards. Since the largest interval represents my
weakest estimate I feel it should be used as the indicator of my overall
expected accuracy.


(2) It seems obvious to me that if you add the estimates you must also
add together the intervals surrounding the estimates. Since each problem
has some uncertainty associated with it you have no choice but to carry all
of it whether large or small into the final interval estimation. Clearly,
since I couldn't use pencil and paper I wasn't able to follow this exactly
but I think it accurately summarizes my basic approach to the problem.


(3) I see the process of estimating as in some way
self correcting. Sometimes you're high, sometimes low, so that when you
... as your largest interval nor as good as
... that the final interval
should be like an average of the intervals in the individual problems.
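
The three strategies described in these paragraphs can be compared with a
short sketch. This is one possible reading, not the author's scoring
procedure: the function name aggregate, the strategy labels, and the decision
to treat the distance below and the distance above each estimate separately
(so asymmetric intervals are preserved) are all assumptions. The two
component problems use hypothetical values.

```python
def aggregate(components, strategy):
    """Build an interval around the summed estimate.

    components: list of (low, estimate, high) tuples, one per problem.
    strategy: "largest", "additive", or "averaging" (hypothetical labels).
    Returns (low, estimate, high) for the sum.
    """
    total = sum(est for _, est, _ in components)
    below = [est - low for low, est, _ in components]    # room below each estimate
    above = [high - est for _, est, high in components]  # room above each estimate

    if strategy == "largest":
        # Reuse the offsets of the single widest component interval.
        widest = max(range(len(components)), key=lambda k: below[k] + above[k])
        low_offset, high_offset = below[widest], above[widest]
    elif strategy == "additive":
        # Carry all of the component uncertainty into the final interval.
        low_offset, high_offset = sum(below), sum(above)
    elif strategy == "averaging":
        # Treat the final interval as the mean of the component intervals.
        n = len(components)
        low_offset, high_offset = sum(below) / n, sum(above) / n
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return (total - low_offset, total, total + high_offset)


# Two hypothetical component problems: (low, estimate, high).
problems = [(140, 150, 165), (100, 120, 130)]
print(aggregate(problems, "largest"))    # (250, 270, 280)
print(aggregate(problems, "additive"))   # (240, 270, 295)
print(aggregate(problems, "averaging"))  # (255.0, 270, 282.5)
```

With these values the additive interval is the widest and the averaging
interval the narrowest, with the largest-interval strategy in between.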